Boosting Speaker Recognition Performance with Compact Representations

نویسندگان

Sibel Yaman

Jason W. Pelecanos

Mohamed Kamal Omar

چکیده

This paper describes a speaker recognition system combination approach in which the compact forms of MAP adapted GMM supervectors are used to boost the performance of a highdimensional supervector-based system or a combination of multiple systems. The compact supervector representations are subjected to a diagonal transformation to emphasize those dimensions that describe significant speaker information and to deemphasize noisy dimensions. Scores obtained from these representations are then combined with the scores obtained from high-dimensional supervector representations. The transformation parameters and the combination weights are estimated by minimizing a discriminative training objective function that approximates a minimum detection cost function. We carried out experiments on two NIST 2008 Speaker Recognition Evaluation English telephony tasks to compare the proposed approach with direct score combination obtained from lowand highdimensional supervector representations. We have found that the proposed approach yields up to 18% relative gain.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

متن کامل

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

متن کامل

Title Speaker Recognition Using Adaptively Boosted Decision Tree Classifier( Accepted Version )

In this paper, a novel approach for speaker recognition is proposed. The approach makes use of adaptive boosting (AdaBoost) and C4.5 decision trees for closed set, text-dependent speaker recognition. A subset of 20 speakers, 10 male and 10 female, drawn from the YOHO speaker verification corpus is used to assess the performance of the system. Results reveal that an accuracy of 99.5% of speaker ...

متن کامل

Boosting Localized Features for Speaker and Speech Recognition

In this thesis, we propose a novel approach for speaker and speech recognition involving localized, binary, data-driven features. The proposed approach is largely inspired by similar localized approaches in the computer vision domain. The success of these existing approaches coupled with their proven advantages of robustness and computational efficiency motivated us to apply these ideas to the ...

متن کامل

Text-dependent speaker recognition by efficient capture of speaker dynamics in compressed time-frequency representations of speech

Prevalent speaker recognition methods use only spectralenvelope based features such as MFCC, ignoring the rich speaker identity information contained in the temporalspectral dynamics of the entire speech signal. We propose a new feature called compressed spectral dynamics or CSD for speaker recognition based on a compressed time-frequency representations of spoken passwords which effectively ca...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2011

Boosting Speaker Recognition Performance with Compact Representations

نویسندگان

چکیده

منابع مشابه

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

Title Speaker Recognition Using Adaptively Boosted Decision Tree Classifier( Accepted Version )

Boosting Localized Features for Speaker and Speech Recognition

Text-dependent speaker recognition by efficient capture of speaker dynamics in compressed time-frequency representations of speech

عنوان ژورنال:

اشتراک گذاری